Overview

Dataset statistics

Number of variables: 26
Number of observations: 933973
Missing cells: 158318
Missing cells (%): 0.7%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 531.3 MiB
Average record size in memory: 596.5 B
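
The two derived figures above can be cross-checked from the raw counts. The sketch below assumes that the missing-cell percentage is taken over all cells (variables × observations) and that 1 MiB is 1024² bytes; the variable names are illustrative and do not come from the report.

```python
# Figures copied from the dataset statistics in this report.
n_vars = 26
n_obs = 933973
missing_cells = 158318
total_mib = 531.3

# Assumption: the percentage is relative to every cell in the table.
missing_pct = 100 * missing_cells / (n_vars * n_obs)

# Assumption: 1 MiB = 1024 * 1024 bytes.
avg_record_bytes = total_mib * 1024**2 / n_obs

print(f"Missing cells: {missing_pct:.1f}%")
print(f"Average record size: {avg_record_bytes:.1f} B")
```

Under those assumptions both values round to the figures shown in the report (0.7% and 596.5 B).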

Variable types

Numeric: 11
Categorical: 14
Boolean: 1

Alerts

insurance_amt has constant value "0" Constant
finance_chgs has constant value "0" Constant
installment_chgs has constant value "0" Constant
tracker_chgs has constant value "0" Constant
insurance_recv has constant value "0" Constant
finance_chgs_recv has constant value "0" Constant
installment_chgs_recv has constant value "0" Constant
tracker_chgs_recv has constant value "0" Constant
install_due_date has a high cardinality: 3421 distinct values High cardinality
installment receiving date has a high cardinality: 2403 distinct values High cardinality
Loan Settelment_Date has a high cardinality: 99753 distinct values High cardinality
unit_id is highly correlated with insurance_amt and 7 other fields High correlation
business_proposal_no is highly correlated with ProductDescription High correlation
sub_proposal_no is highly correlated with insurance_amt and 7 other fields High correlation
install_no is highly correlated with markup_amt High correlation
principal_amt is highly correlated with markup_amt and 3 other fields High correlation
markup_amt is highly correlated with principal_amt and 4 other fields High correlation
principal_recv is highly correlated with principal_amt and 3 other fields High correlation
markup_recv is highly correlated with principal_amt and 5 other fields High correlation
os_principal is highly correlated with markup_recv and 2 other fields High correlation
os_markup is highly correlated with markup_recv and 1 other field High correlation
session_id is highly correlated with business_proposal_no High correlation
insurance_amt is highly correlated with Status and 10 other fields High correlation
finance_chgs is highly correlated with insurance_amt and 10 other fields High correlation
installment_chgs is highly correlated with insurance_amt and 10 other fields High correlation
tracker_chgs is highly correlated with insurance_amt and 10 other fields High correlation
insurance_recv is highly correlated with insurance_amt and 10 other fields High correlation
finance_chgs_recv is highly correlated with insurance_amt and 10 other fields High correlation
installment_chgs_recv is highly correlated with insurance_amt and 10 other fields High correlation
tracker_chgs_recv is highly correlated with insurance_amt and 10 other fields High correlation
os_install_flag is highly correlated with markup_amt and 2 other fields High correlation
no_of_receipts is highly correlated with ProductDescription High correlation
Status is highly correlated with os_install_flag High correlation
ProductDescription is highly correlated with business_proposal_no and 7 other fields High correlation
installment receiving date has 158318 (17.0%) missing values Missing
session_id is highly skewed (γ1 = 23.00611905) Skewed
principal_amt has 30288 (3.2%) zeros Zeros
markup_amt has 19212 (2.1%) zeros Zeros
principal_recv has 178678 (19.1%) zeros Zeros
markup_recv has 170046 (18.2%) zeros Zeros
os_principal has 141632 (15.2%) zeros Zeros
os_markup has 137489 (14.7%) zeros Zeros
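
The Constant alerts above flag eight columns that hold only the value "0". As a minimal sketch of how such columns could be detected before further analysis, the hypothetical helper `constant_columns` below scans rows given as dictionaries; the toy rows are illustrative and are not taken from the dataset.

```python
def constant_columns(rows):
    """Return the names of columns that hold a single distinct value."""
    distinct = {}
    for row in rows:
        for name, value in row.items():
            distinct.setdefault(name, set()).add(value)
    return sorted(name for name, values in distinct.items() if len(values) == 1)


# Toy rows: column names follow the report, values are made up.
rows = [
    {"insurance_amt": "0", "tracker_chgs": "0", "principal_amt": 4231.0},
    {"insurance_amt": "0", "tracker_chgs": "0", "principal_amt": 0.0},
    {"insurance_amt": "0", "tracker_chgs": "0", "principal_amt": 100000.0},
]

print(constant_columns(rows))
```

Columns reported this way carry no information for modeling, which is presumably why the report marks them as rejected.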

Reproduction

Analysis started: 2022-11-08 09:10:23.008550
Analysis finished: 2022-11-08 09:12:13.947584
Duration: 1 minute and 50.94 seconds
Software version: pandas-profiling v3.4.0
Download configuration: config.json

Variables

unit_id
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 124
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2523.575249
Minimum: 205
Maximum: 9042
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 7.1 MiB

Quantile statistics

Minimum: 205
5-th percentile: 217
Q1: 319
median: 612
Q3: 708
95-th percentile: 9039
Maximum: 9042
Range: 8837
Interquartile range (IQR): 389

Descriptive statistics

Standard deviation: 3635.383255
Coefficient of variation (CV): 1.440568597
Kurtosis: -0.4937699652
Mean: 2523.575249
Median Absolute Deviation (MAD): 208
Skewness: 1.223100723
Sum: 2356951146
Variance: 13216011.41
Monotonicity: Not monotonic
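
Several of the unit_id statistics are simple functions of the others. The sketch below recomputes them from the reported values as a consistency check. It uses the standard definitions (mean = sum / n, variance = standard deviation squared, CV = standard deviation / mean, IQR = Q3 − Q1, range = maximum − minimum), which are assumed rather than confirmed to be what the profiling tool applies.

```python
# Values copied from the unit_id summary in this report.
n = 933973
total = 2356951146
mean = 2523.575249
std = 3635.383255
q1, q3 = 319, 708
minimum, maximum = 205, 9042

# Standard textbook relationships between the summary statistics.
derived = {
    "mean": total / n,
    "variance": std ** 2,
    "cv": std / mean,
    "iqr": q3 - q1,
    "range": maximum - minimum,
}

for name, value in derived.items():
    print(f"{name}: {value:.6f}")
```

Each recomputed value agrees with the corresponding reported figure to within the precision printed in the report.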
Histogram with fixed size bins (bins=50)
Value               Count   Frequency (%)
703                 106690  11.4%
301                 51412   5.5%
701                 43539   4.7%
603                 42240   4.5%
9041                42141   4.5%
209                 36038   3.9%
9003                33255   3.6%
612                 24278   2.6%
305                 22292   2.4%
313                 21538   2.3%
Other values (114)  510550  54.7%

Value  Count  Frequency (%)
205    1003   0.1%
209    36038  3.9%
210    1250   0.1%
213    2599   0.3%
214    2377   0.3%
216    2765   0.3%
217    1924   0.2%
218    2397   0.3%
219    7735   0.8%
220    2372   0.3%

Value  Count  Frequency (%)
9042   985    0.1%
9041   42141  4.5%
9040   2002   0.2%
9039   1885   0.2%
9038   2546   0.3%
9037   5447   0.6%
9036   4210   0.5%
9035   2195   0.2%
9034   1272   0.1%
9033   2074   0.2%

business_proposal_no
Real number (ℝ≥0)

HIGH CORRELATION

Distinct109476
Distinct (%)11.7%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean291064.8987
Minimum2
Maximum715816
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size7.1 MiB

Quantile statistics

Minimum2
5-th percentile16094
Q1103262
median232598
Q3533243
95-th percentile625792
Maximum715816
Range715814
Interquartile range (IQR)429981

Descriptive statistics

Standard deviation211295.7082
Coefficient of variation (CV)0.7259401911
Kurtosis-1.331585463
Mean291064.8987
Median Absolute Deviation (MAD)163561
Skewness0.3471036869
Sum2.718467567 × 10^11
Variance4.464587631 × 10^10
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
28363788
 
< 0.1%
28390287
 
< 0.1%
23068684
 
< 0.1%
23137183
 
< 0.1%
22966581
 
< 0.1%
22468981
 
< 0.1%
18337181
 
< 0.1%
28376580
 
< 0.1%
22587180
 
< 0.1%
22654779
 
< 0.1%
Other values (109466)933149
99.9%
ValueCountFrequency (%)
230
< 0.1%
1318
< 0.1%
2418
< 0.1%
2518
< 0.1%
4612
 
< 0.1%
5633
< 0.1%
7515
< 0.1%
8230
< 0.1%
10318
< 0.1%
10818
< 0.1%
ValueCountFrequency (%)
71581612
< 0.1%
71429612
< 0.1%
71107112
< 0.1%
7109961
 
< 0.1%
7104301
 
< 0.1%
70935112
< 0.1%
70902312
< 0.1%
70773012
< 0.1%
70725112
< 0.1%
70584218
< 0.1%

sub_proposal_no
Real number (ℝ≥0)

HIGH CORRELATION

Distinct16
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean1.992615418
Minimum1
Maximum16
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size7.1 MiB

Quantile statistics

Minimum1
5-th percentile1
Q11
median1
Q32
95-th percentile5
Maximum16
Range15
Interquartile range (IQR)1

Descriptive statistics

Standard deviation1.564502029
Coefficient of variation (CV)0.7851500168
Kurtosis7.092219018
Mean1.992615418
Median Absolute Deviation (MAD)0
Skewness2.351733863
Sum1861049
Variance2.447666599
MonotonicityNot monotonic
Histogram with fixed size bins (bins=16)
Value             Count   Frequency (%)
1                 509346  54.5%
2                 203954  21.8%
3                 99244   10.6%
4                 51413   5.5%
5                 29937   3.2%
6                 17245   1.8%
7                 9968    1.1%
8                 5515    0.6%
9                 3396    0.4%
10                2003    0.2%
Other values (6)  1952    0.2%

Value  Count   Frequency (%)
1      509346  54.5%
2      203954  21.8%
3      99244   10.6%
4      51413   5.5%
5      29937   3.2%
6      17245   1.8%
7      9968    1.1%
8      5515    0.6%
9      3396    0.4%
10     2003    0.2%

Value  Count  Frequency (%)
16     12     < 0.1%
15     24     < 0.1%
14     68     < 0.1%
13     269    < 0.1%
12     396    < 0.1%
11     1183   0.1%
10     2003   0.2%
9      3396   0.4%
8      5515   0.6%
7      9968   1.1%

install_no
Real number (ℝ≥0)

HIGH CORRELATION

Distinct36
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean6.79917942
Minimum1
Maximum36
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size7.1 MiB

Quantile statistics

Minimum1
5-th percentile1
Q13
median6
Q310
95-th percentile15
Maximum36
Range35
Interquartile range (IQR)7

Descriptive statistics

Standard deviation4.502865684
Coefficient of variation (CV)0.6622660481
Kurtosis0.2945002387
Mean6.79917942
Median Absolute Deviation (MAD)4
Skewness0.654949364
Sum6350250
Variance20.27579937
MonotonicityNot monotonic
Histogram with fixed size bins (bins=36)
Value              Count   Frequency (%)
1                  137345  14.7%
2                  69625   7.5%
3                  66187   7.1%
4                  65873   7.1%
5                  65717   7.0%
6                  65543   7.0%
7                  65320   7.0%
8                  65315   7.0%
9                  65310   7.0%
10                 65251   7.0%
Other values (26)  202487  21.7%

Value  Count   Frequency (%)
1      137345  14.7%
2      69625   7.5%
3      66187   7.1%
4      65873   7.1%
5      65717   7.0%
6      65543   7.0%
7      65320   7.0%
8      65315   7.0%
9      65310   7.0%
10     65251   7.0%

Value  Count  Frequency (%)
36     1      < 0.1%
35     2      < 0.1%
34     7      < 0.1%
33     11     < 0.1%
32     12     < 0.1%
31     18     < 0.1%
30     20     < 0.1%
29     30     < 0.1%
28     73     < 0.1%
27     164    < 0.1%

install_due_date
Categorical

HIGH CARDINALITY

Distinct3421
Distinct (%)0.4%
Missing0
Missing (%)0.0%
Memory size72.1 MiB
Value                     Count
[2019/07/15:12:00:00 AM]  3770
[2019/11/11:12:00:00 AM]  3716
[2019/06/10:12:00:00 AM]  3637
[2019/09/10:12:00:00 AM]  3377
[2019/12/16:12:00:00 AM]  3366
Other values (3416)       916107

Length

Max length24
Median length24
Mean length24
Min length24

Characters and Unicode

Total characters22415352
Distinct characters17
Distinct categories6
Distinct scripts2
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique397
Unique (%)< 0.1%

Sample

1st row[2014/03/07:12:00:00 AM]
2nd row[2014/04/07:12:00:00 AM]
3rd row[2014/05/07:12:00:00 AM]
4th row[2014/06/07:12:00:00 AM]
5th row[2014/07/07:12:00:00 AM]
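
The sample rows show that install_due_date is stored as bracketed text, which is why the report treats it as a high-cardinality categorical variable rather than a date. A minimal sketch of converting such strings to datetimes follows. The format string is an assumption inferred from the samples (year/month/day, 12-hour clock with an AM/PM marker), and `parse_due_date` is a hypothetical helper name.

```python
from datetime import datetime

# Assumed layout, inferred from the sample rows: [YYYY/MM/DD:HH:MM:SS AM|PM]
DUE_DATE_FORMAT = "[%Y/%m/%d:%I:%M:%S %p]"


def parse_due_date(text):
    """Parse one bracketed timestamp string into a datetime."""
    return datetime.strptime(text, DUE_DATE_FORMAT)


parsed = parse_due_date("[2014/03/07:12:00:00 AM]")
print(parsed.isoformat())
```

Since every sampled value carries the same 12:00:00 AM time component, only the date part is likely to be informative once parsed.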

Common Values

Value                     Count   Frequency (%)
[2019/07/15:12:00:00 AM]  3770    0.4%
[2019/11/11:12:00:00 AM]  3716    0.4%
[2019/06/10:12:00:00 AM]  3637    0.4%
[2019/09/10:12:00:00 AM]  3377    0.4%
[2019/12/16:12:00:00 AM]  3366    0.4%
[2019/08/10:12:00:00 AM]  3366    0.4%
[2019/10/10:12:00:00 AM]  3324    0.4%
[2019/08/15:12:00:00 AM]  3252    0.3%
[2019/07/10:12:00:00 AM]  3246    0.3%
[2019/01/10:12:00:00 AM]  3170    0.3%
Other values (3411)       899749  96.3%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
am933973
50.0%
2019/07/15:12:00:003770
 
0.2%
2019/11/11:12:00:003716
 
0.2%
2019/06/10:12:00:003637
 
0.2%
2019/09/10:12:00:003377
 
0.2%
2019/12/16:12:00:003366
 
0.2%
2019/08/10:12:00:003366
 
0.2%
2019/10/10:12:00:003324
 
0.2%
2019/08/15:12:00:003252
 
0.2%
2019/07/10:12:00:003246
 
0.2%
Other values (3412)902919
48.3%

Most occurring characters

ValueCountFrequency (%)
06048323
27.0%
:2801919
12.5%
12677045
11.9%
22501967
11.2%
/1867946
 
8.3%
933973
 
4.2%
]933973
 
4.2%
M933973
 
4.2%
A933973
 
4.2%
[933973
 
4.2%
Other values (7)1848287
 
8.2%

Most occurring categories

ValueCountFrequency (%)
Decimal Number13075622
58.3%
Other Punctuation4669865
 
20.8%
Uppercase Letter1867946
 
8.3%
Space Separator933973
 
4.2%
Close Punctuation933973
 
4.2%
Open Punctuation933973
 
4.2%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
06048323
46.3%
12677045
20.5%
22501967
19.1%
9458588
 
3.5%
8354652
 
2.7%
5308938
 
2.4%
6215619
 
1.6%
7187191
 
1.4%
4168484
 
1.3%
3154815
 
1.2%
Other Punctuation
ValueCountFrequency (%)
:2801919
60.0%
/1867946
40.0%
Uppercase Letter
ValueCountFrequency (%)
M933973
50.0%
A933973
50.0%
Space Separator
ValueCountFrequency (%)
933973
100.0%
Close Punctuation
ValueCountFrequency (%)
]933973
100.0%
Open Punctuation
ValueCountFrequency (%)
[933973
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common20547406
91.7%
Latin1867946
 
8.3%

Most frequent character per script

Common
ValueCountFrequency (%)
06048323
29.4%
:2801919
13.6%
12677045
13.0%
22501967
12.2%
/1867946
 
9.1%
933973
 
4.5%
]933973
 
4.5%
[933973
 
4.5%
9458588
 
2.2%
8354652
 
1.7%
Other values (5)1035047
 
5.0%
Latin
ValueCountFrequency (%)
M933973
50.0%
A933973
50.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII22415352
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
06048323
27.0%
:2801919
12.5%
12677045
11.9%
22501967
11.2%
/1867946
 
8.3%
933973
 
4.2%
]933973
 
4.2%
M933973
 
4.2%
A933973
 
4.2%
[933973
 
4.2%
Other values (7)1848287
 
8.2%

principal_amt
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS

Distinct84625
Distinct (%)9.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean15623.19152
Minimum0
Maximum500000
Zeros30288
Zeros (%)3.2%
Negative0
Negative (%)0.0%
Memory size7.1 MiB

Quantile statistics

Minimum0
5-th percentile2748
Q14674
median7245
Q311607.19
95-th percentile100000
Maximum500000
Range500000
Interquartile range (IQR)6933.19

Descriptive statistics

Standard deviation27785.52698
Coefficient of variation (CV)1.778479572
Kurtosis12.88508692
Mean15623.19152
Median Absolute Deviation (MAD)2974.19
Skewness3.570812243
Sum1.459163906 × 10^10
Variance772035509.7
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
030288
 
3.2%
10000020232
 
2.2%
15000017931
 
1.9%
42316319
 
0.7%
48906314
 
0.7%
500005400
 
0.6%
33515287
 
0.6%
47555256
 
0.6%
34845248
 
0.6%
45745244
 
0.6%
Other values (84615)826454
88.5%
ValueCountFrequency (%)
030288
3.2%
1.661
 
< 0.1%
8.311
 
< 0.1%
91
 
< 0.1%
9.621
 
< 0.1%
11.371
 
< 0.1%
121
 
< 0.1%
18.081
 
< 0.1%
211
 
< 0.1%
27.251
 
< 0.1%
ValueCountFrequency (%)
5000001
 
< 0.1%
40000014
< 0.1%
3900003
 
< 0.1%
3250002
 
< 0.1%
3200001
 
< 0.1%
3000007
< 0.1%
2600002
 
< 0.1%
252001.441
 
< 0.1%
2500003
 
< 0.1%
228586.961
 
< 0.1%

markup_amt
Real number (ℝ)

HIGH CORRELATION
ZEROS

Distinct95952
Distinct (%)10.3%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean4746.85931
Minimum-5447
Maximum169521
Zeros19212
Zeros (%)2.1%
Negative308
Negative (%)< 0.1%
Memory size7.1 MiB

Quantile statistics

Minimum-5447
5-th percentile336
Q11102
median1986.5
Q33781.4
95-th percentile25632
Maximum169521
Range174968
Interquartile range (IQR)2679.4

Descriptive statistics

Standard deviation8919.5076
Coefficient of variation (CV)1.879033487
Kurtosis15.62255874
Mean4746.85931
Median Absolute Deviation (MAD)1147.5
Skewness3.782780903
Sum4433438430
Variance79557615.83
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
019212
 
2.1%
9345468
 
0.6%
4475464
 
0.6%
14185404
 
0.6%
3905304
 
0.6%
12635296
 
0.6%
11025293
 
0.6%
15675285
 
0.6%
5785277
 
0.6%
18495260
 
0.6%
Other values (95942)866710
92.8%
ValueCountFrequency (%)
-54471
 
< 0.1%
-45932
 
< 0.1%
-43541
 
< 0.1%
-36051
 
< 0.1%
-35951
 
< 0.1%
-34681
 
< 0.1%
-32571
 
< 0.1%
-31201
 
< 0.1%
-30471
 
< 0.1%
-30086
< 0.1%
ValueCountFrequency (%)
1695211
< 0.1%
1415381
< 0.1%
1360002
< 0.1%
1320001
< 0.1%
1308051
< 0.1%
128586.961
< 0.1%
1156711
< 0.1%
1146001
< 0.1%
1141941
< 0.1%
1140002
< 0.1%

insurance_amt
Categorical

CONSTANT
HIGH CORRELATION
REJECTED

Distinct1
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size51.7 MiB
0
933973 

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters933973
Distinct characters1
Distinct categories1
Distinct scripts1
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0
Unique (%)0.0%

Sample

1st row0
2nd row0
3rd row0
4th row0
5th row0

Common Values

ValueCountFrequency (%)
0933973
100.0%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
0933973
100.0%

Most occurring characters

ValueCountFrequency (%)
0933973
100.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number933973
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0933973
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common933973
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0933973
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII933973
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0933973
100.0%

finance_chgs
Categorical

CONSTANT
HIGH CORRELATION
REJECTED

Distinct1
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size51.7 MiB
0
933973 

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters933973
Distinct characters1
Distinct categories1
Distinct scripts1
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0
Unique (%)0.0%

Sample

1st row0
2nd row0
3rd row0
4th row0
5th row0

Common Values

ValueCountFrequency (%)
0933973
100.0%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
0933973
100.0%

Most occurring characters

ValueCountFrequency (%)
0933973
100.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number933973
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0933973
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common933973
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0933973
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII933973
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0933973
100.0%

installment_chgs
Categorical

CONSTANT
HIGH CORRELATION
REJECTED

Distinct1
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size51.7 MiB
0
933973 

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters933973
Distinct characters1
Distinct categories1
Distinct scripts1
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0
Unique (%)0.0%

Sample

1st row0
2nd row0
3rd row0
4th row0
5th row0

Common Values

ValueCountFrequency (%)
0933973
100.0%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
0933973
100.0%

Most occurring characters

ValueCountFrequency (%)
0933973
100.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number933973
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0933973
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common933973
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0933973
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII933973
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0933973
100.0%

tracker_chgs
Categorical

CONSTANT
HIGH CORRELATION
REJECTED

Distinct1
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size51.7 MiB
0
933973 

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters933973
Distinct characters1
Distinct categories1
Distinct scripts1
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0
Unique (%)0.0%

Sample

1st row0
2nd row0
3rd row0
4th row0
5th row0

Common Values

ValueCountFrequency (%)
0933973
100.0%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
0933973
100.0%

Most occurring characters

ValueCountFrequency (%)
0933973
100.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number933973
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0933973
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common933973
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0933973
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII933973
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0933973
100.0%

principal_recv
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS

Distinct71739
Distinct (%)7.7%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean8648.759114
Minimum0
Maximum393699
Zeros178678
Zeros (%)19.1%
Negative0
Negative (%)0.0%
Memory size7.1 MiB

Quantile statistics

Minimum0
5-th percentile0
Q13484
median5575
Q39237
95-th percentile21199
Maximum393699
Range393699
Interquartile range (IQR)5753

Descriptive statistics

Standard deviation15916.15428
Coefficient of variation (CV)1.840281833
Kurtosis47.26952118
Mean8648.759114
Median Absolute Deviation (MAD)2889
Skewness6.33987391
Sum8077707496
Variance253323966.9
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
0178678
 
19.1%
42316264
 
0.7%
33515283
 
0.6%
34845242
 
0.6%
36225220
 
0.6%
37665211
 
0.6%
39155204
 
0.6%
40705194
 
0.6%
43995174
 
0.6%
45745048
 
0.5%
Other values (71729)707455
75.7%
ValueCountFrequency (%)
0178678
19.1%
0.381
 
< 0.1%
0.651
 
< 0.1%
11
 
< 0.1%
1.131
 
< 0.1%
1.251
 
< 0.1%
1.422
 
< 0.1%
1.51
 
< 0.1%
1.841
 
< 0.1%
3.42
 
< 0.1%
ValueCountFrequency (%)
3936991
< 0.1%
3900001
< 0.1%
3250002
< 0.1%
3003351
< 0.1%
3000001
< 0.1%
2769761
< 0.1%
2600001
< 0.1%
253180.751
< 0.1%
2500001
< 0.1%
2099181
< 0.1%

markup_recv
Real number (ℝ)

HIGH CORRELATION
ZEROS

Distinct75161
Distinct (%)8.0%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean2842.69503
Minimum-5447
Maximum132000
Zeros170046
Zeros (%)18.2%
Negative170
Negative (%)< 0.1%
Memory size7.1 MiB

Quantile statistics

Minimum-5447
5-th percentile0
Q1615
median1682.79
Q33134
95-th percentile8006.91
Maximum132000
Range137447
Interquartile range (IQR)2519

Descriptive statistics

Standard deviation5358.862554
Coefficient of variation (CV)1.885134528
Kurtosis42.40248814
Mean2842.69503
Median Absolute Deviation (MAD)1224.79
Skewness5.888681283
Sum2655000406
Variance28717407.87
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
0170046
 
18.2%
14185381
 
0.6%
9345308
 
0.6%
15675272
 
0.6%
12635258
 
0.6%
18495253
 
0.6%
11025238
 
0.6%
17115219
 
0.6%
7595063
 
0.5%
5784581
 
0.5%
Other values (75151)717354
76.8%
ValueCountFrequency (%)
-54471
 
< 0.1%
-45932
 
< 0.1%
-30085
< 0.1%
-27834
< 0.1%
-27371
 
< 0.1%
-26041
 
< 0.1%
-24241
 
< 0.1%
-22822
 
< 0.1%
-20061
 
< 0.1%
-20001
 
< 0.1%
ValueCountFrequency (%)
1320001
< 0.1%
128586.961
< 0.1%
109578.421
< 0.1%
104701.441
< 0.1%
1007501
< 0.1%
916161
< 0.1%
908931
< 0.1%
90314.381
< 0.1%
802551
< 0.1%
795321
< 0.1%

insurance_recv
Categorical

CONSTANT
HIGH CORRELATION
REJECTED

Distinct1
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size51.7 MiB
0
933973 

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters933973
Distinct characters1
Distinct categories1
Distinct scripts1
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0
Unique (%)0.0%

Sample

1st row0
2nd row0
3rd row0
4th row0
5th row0

Common Values

ValueCountFrequency (%)
0933973
100.0%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
0933973
100.0%

Most occurring characters

ValueCountFrequency (%)
0933973
100.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number933973
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0933973
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common933973
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0933973
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII933973
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0933973
100.0%

finance_chgs_recv
Categorical

CONSTANT
HIGH CORRELATION
REJECTED

Distinct1
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size51.7 MiB
0
933973 

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters933973
Distinct characters1
Distinct categories1
Distinct scripts1
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0
Unique (%)0.0%

Sample

1st row0
2nd row0
3rd row0
4th row0
5th row0

Common Values

ValueCountFrequency (%)
0933973
100.0%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
0933973
100.0%

Most occurring characters

ValueCountFrequency (%)
0933973
100.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number933973
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0933973
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common933973
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0933973
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII933973
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0933973
100.0%

installment_chgs_recv
Categorical

CONSTANT
HIGH CORRELATION
REJECTED

Distinct1
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size51.7 MiB
0
933973 

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters933973
Distinct characters1
Distinct categories1
Distinct scripts1
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0
Unique (%)0.0%

Sample

1st row0
2nd row0
3rd row0
4th row0
5th row0

Common Values

ValueCountFrequency (%)
0933973
100.0%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
0933973
100.0%

Most occurring characters

ValueCountFrequency (%)
0933973
100.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number933973
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0933973
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common933973
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0933973
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII933973
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0933973
100.0%

tracker_chgs_recv
Categorical

CONSTANT
HIGH CORRELATION
REJECTED

Distinct1
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size51.7 MiB
0
933973 

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters933973
Distinct characters1
Distinct categories1
Distinct scripts1
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0
Unique (%)0.0%

Sample

1st row0
2nd row0
3rd row0
4th row0
5th row0

Common Values

ValueCountFrequency (%)
0933973
100.0%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
0933973
100.0%

Most occurring characters

ValueCountFrequency (%)
0933973
100.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number933973
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0933973
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common933973
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0933973
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII933973
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0933973
100.0%

os_principal
Real number (ℝ)

HIGH CORRELATION
ZEROS

Distinct87864
Distinct (%)9.4%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean51540.6945
Minimum-15198
Maximum742642.32
Zeros141632
Zeros (%)15.2%
Negative330
Negative (%)< 0.1%
Memory size7.1 MiB

Quantile statistics

Minimum-15198
5-th percentile0
Q114315.73
median37545
Q370055
95-th percentile157111
Maximum742642.32
Range757840.32
Interquartile range (IQR)55739.27

Descriptive statistics

Standard deviation57267.62988
Coefficient of variation (CV)1.111114828
Kurtosis8.958101204
Mean51540.6945
Median Absolute Deviation (MAD)26545.62
Skewness2.471210633
Sum4.813761706 × 10^10
Variance3279581432
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
0141632
 
15.2%
191625340
 
0.6%
466495222
 
0.6%
357775221
 
0.6%
431655221
 
0.6%
48905221
 
0.6%
98335221
 
0.6%
145885221
 
0.6%
235615221
 
0.6%
277925221
 
0.6%
Other values (87854)745232
79.8%
ValueCountFrequency (%)
-151984
< 0.1%
-146784
< 0.1%
-140434
< 0.1%
-135354
< 0.1%
-131284
< 0.1%
-125714
< 0.1%
-122284
< 0.1%
-116754
< 0.1%
-112534
< 0.1%
-109974
< 0.1%
ValueCountFrequency (%)
742642.321
< 0.1%
683780.691
< 0.1%
6383191
< 0.1%
625904.631
< 0.1%
623375.691
< 0.1%
613676.071
< 0.1%
601169.621
< 0.1%
587938.281
< 0.1%
575277.941
< 0.1%
561055.491
< 0.1%

os_markup
Real number (ℝ)

HIGH CORRELATION
ZEROS

Distinct123635
Distinct (%)13.2%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean11012.99922
Minimum-5447
Maximum351841.81
Zeros137489
Zeros (%)14.7%
Negative364
Negative (%)< 0.1%
Memory size7.1 MiB

Quantile statistics

Minimum-5447
5-th percentile0
Q11227
median5806
Q313783
95-th percentile39326.04
Maximum351841.81
Range357288.81
Interquartile range (IQR)12556

Descriptive statistics

Standard deviation16490.48086
Coefficient of variation (CV)1.497365116
Kurtosis25.79956055
Mean11012.99922
Median Absolute Deviation (MAD)5320
Skewness3.946335704
Sum1.028584392 × 10^10
Variance271935958.9
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
0 | 137489 | 14.7%
2174 | 5313 | 0.6%
6891 | 5292 | 0.6%
8458 | 5247 | 0.6%
1415 | 5235 | 0.6%
837 | 5228 | 0.6%
5473 | 5226 | 0.6%
3108 | 5225 | 0.6%
10169 | 5223 | 0.6%
4210 | 5223 | 0.6%
Other values (123625) | 749272 | 80.2%

Value | Count | Frequency (%)
-5447 | 1 | < 0.1%
-4593 | 2 | < 0.1%
-4354 | 1 | < 0.1%
-3605 | 1 | < 0.1%
-3595 | 1 | < 0.1%
-3468 | 1 | < 0.1%
-3284 | 1 | < 0.1%
-3257 | 1 | < 0.1%
-3120 | 1 | < 0.1%
-3047 | 1 | < 0.1%

Value | Count | Frequency (%)
351841.81 | 1 | < 0.1%
342217.61 | 1 | < 0.1%
336772.94 | 1 | < 0.1%
324441.14 | 1 | < 0.1%
321391.37 | 1 | < 0.1%
308521.72 | 1 | < 0.1%
306009.8 | 1 | < 0.1%
292983.41 | 1 | < 0.1%
291124.41 | 1 | < 0.1%
287480 | 1 | < 0.1%

os_install_flag
Boolean

HIGH CORRELATION

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size912.2 KiB
False
775656 
True
158317 
ValueCountFrequency (%)
False775656
83.0%
True158317
 
17.0%

installment receiving date
Categorical

HIGH CARDINALITY
MISSING

Distinct2403
Distinct (%)0.3%
Missing158318
Missing (%)17.0%
Memory size64.7 MiB
[2019/08/16:12:00:00 AM]
 
4432
[2019/04/15:12:00:00 AM]
 
3153
[2019/06/10:12:00:00 AM]
 
3118
[2019/09/11:12:00:00 AM]
 
3113
[2020/07/06:12:00:00 AM]
 
2931
Other values (2398)
758908 

Length

Max length24
Median length24
Mean length24
Min length24

Characters and Unicode

Total characters18615720
Distinct characters17
Distinct categories6 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique192 ?
Unique (%)< 0.1%

Sample

1st row[2014/03/08:12:00:00 AM]
2nd row[2014/04/07:12:00:00 AM]
3rd row[2014/05/07:12:00:00 AM]
4th row[2014/06/10:12:00:00 AM]
5th row[2014/07/08:12:00:00 AM]

Common Values

ValueCountFrequency (%)
[2019/08/16:12:00:00 AM]4432
 
0.5%
[2019/04/15:12:00:00 AM]3153
 
0.3%
[2019/06/10:12:00:00 AM]3118
 
0.3%
[2019/09/11:12:00:00 AM]3113
 
0.3%
[2020/07/06:12:00:00 AM]2931
 
0.3%
[2018/06/19:12:00:00 AM]2888
 
0.3%
[2019/02/11:12:00:00 AM]2858
 
0.3%
[2019/09/16:12:00:00 AM]2854
 
0.3%
[2019/07/15:12:00:00 AM]2818
 
0.3%
[2019/08/19:12:00:00 AM]2747
 
0.3%
Other values (2393)744743
79.7%
(Missing)158318
 
17.0%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
am775655
50.0%
2019/08/16:12:00:004432
 
0.3%
2019/04/15:12:00:003153
 
0.2%
2019/06/10:12:00:003118
 
0.2%
2019/09/11:12:00:003113
 
0.2%
2020/07/06:12:00:002931
 
0.2%
2018/06/19:12:00:002888
 
0.2%
2019/02/11:12:00:002858
 
0.2%
2019/09/16:12:00:002854
 
0.2%
2019/07/15:12:00:002818
 
0.2%
Other values (2394)747490
48.2%

Most occurring characters

ValueCountFrequency (%)
04918669
26.4%
:2326965
12.5%
12221800
11.9%
22086627
11.2%
/1551310
 
8.3%
775655
 
4.2%
]775655
 
4.2%
M775655
 
4.2%
A775655
 
4.2%
[775655
 
4.2%
Other values (7)1632074
 
8.8%

Most occurring categories

ValueCountFrequency (%)
Decimal Number10859170
58.3%
Other Punctuation3878275
 
20.8%
Uppercase Letter1551310
 
8.3%
Space Separator775655
 
4.2%
Close Punctuation775655
 
4.2%
Open Punctuation775655
 
4.2%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
04918669
45.3%
12221800
20.5%
22086627
19.2%
9397592
 
3.7%
8334136
 
3.1%
5225466
 
2.1%
6194007
 
1.8%
7185891
 
1.7%
4151315
 
1.4%
3143667
 
1.3%
Other Punctuation
ValueCountFrequency (%)
:2326965
60.0%
/1551310
40.0%
Uppercase Letter
ValueCountFrequency (%)
M775655
50.0%
A775655
50.0%
Space Separator
ValueCountFrequency (%)
775655
100.0%
Close Punctuation
ValueCountFrequency (%)
]775655
100.0%
Open Punctuation
ValueCountFrequency (%)
[775655
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common17064410
91.7%
Latin1551310
 
8.3%

Most frequent character per script

Common
ValueCountFrequency (%)
04918669
28.8%
:2326965
13.6%
12221800
13.0%
22086627
12.2%
/1551310
 
9.1%
775655
 
4.5%
]775655
 
4.5%
[775655
 
4.5%
9397592
 
2.3%
8334136
 
2.0%
Other values (5)900346
 
5.3%
Latin
ValueCountFrequency (%)
M775655
50.0%
A775655
50.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII18615720
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
04918669
26.4%
:2326965
12.5%
12221800
11.9%
22086627
11.2%
/1551310
 
8.3%
775655
 
4.2%
]775655
 
4.2%
M775655
 
4.2%
A775655
 
4.2%
[775655
 
4.2%
Other values (7)1632074
 
8.8%

no_of_receipts
Categorical

HIGH CORRELATION

Distinct3
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size51.7 MiB
0
543315 
1
388385 
77
 
2273

Length

Max length2
Median length1
Mean length1.002433689
Min length1

Characters and Unicode

Total characters936246
Distinct characters3
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0
2nd row0
3rd row0
4th row0
5th row0

Common Values

ValueCountFrequency (%)
0543315
58.2%
1388385
41.6%
772273
 
0.2%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
0543315
58.2%
1388385
41.6%
772273
 
0.2%

Most occurring characters

ValueCountFrequency (%)
0543315
58.0%
1388385
41.5%
74546
 
0.5%

Most occurring categories

ValueCountFrequency (%)
Decimal Number936246
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0543315
58.0%
1388385
41.5%
74546
 
0.5%

Most occurring scripts

ValueCountFrequency (%)
Common936246
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0543315
58.0%
1388385
41.5%
74546
 
0.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII936246
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0543315
58.0%
1388385
41.5%
74546
 
0.5%

session_id
Real number (ℝ≥0)

HIGH CORRELATION
SKEWED

Distinct: 88569
Distinct (%): 9.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1300791.729
Minimum: 7
Maximum: 99099099
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 7.1 MiB

Quantile statistics

Minimum: 7
5-th percentile: 34663
Q1: 439770
Median: 586951
Q3: 1769027
95-th percentile: 3904137
Maximum: 99099099
Range: 99099092
Interquartile range (IQR): 1329257

Descriptive statistics

Standard deviation: 3827095.925
Coefficient of variation (CV): 2.94212812
Kurtosis: 584.1910437
Mean: 1300791.729
Median Absolute Deviation (MAD): 436499
Skewness: 23.00611905
Sum: 1.214904354 × 10^12
Variance: 1.464666322 × 10^13
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
586951 | 97616 | 10.5%
439798 | 43060 | 4.6%
439797 | 37710 | 4.0%
500039 | 37179 | 4.0%
499665 | 33086 | 3.5%
653770 | 32044 | 3.4%
439770 | 27814 | 3.0%
546389 | 19131 | 2.0%
34635 | 8490 | 0.9%
34671 | 8199 | 0.9%
Other values (88559) | 589644 | 63.1%
ValueCountFrequency (%)
73989
0.4%
2351
 
< 0.1%
2851
 
< 0.1%
3892
 
< 0.1%
40422
 
< 0.1%
4602
 
< 0.1%
7531
 
< 0.1%
8375
 
< 0.1%
8983
 
< 0.1%
9451
 
< 0.1%
ValueCountFrequency (%)
990990991286
0.1%
202003111
 
< 0.1%
47212041
 
< 0.1%
47211321
 
< 0.1%
47210044
 
< 0.1%
47209702
 
< 0.1%
47206932
 
< 0.1%
47205901
 
< 0.1%
47205472
 
< 0.1%
47205301
 
< 0.1%

Status
Categorical

HIGH CORRELATION

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size51.7 MiB
E
573814 
X
360159 

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters933973
Distinct characters2
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowX
2nd rowX
3rd rowX
4th rowX
5th rowX

Common Values

ValueCountFrequency (%)
E573814
61.4%
X360159
38.6%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
e573814
61.4%
x360159
38.6%

Most occurring characters

ValueCountFrequency (%)
E573814
61.4%
X360159
38.6%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter933973
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
E573814
61.4%
X360159
38.6%

Most occurring scripts

ValueCountFrequency (%)
Latin933973
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
E573814
61.4%
X360159
38.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII933973
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
E573814
61.4%
X360159
38.6%

Loan Settelment_Date
Categorical

HIGH CARDINALITY

Distinct99753
Distinct (%)10.7%
Missing0
Missing (%)0.0%
Memory size72.1 MiB
[2018/06/19:12:00:00 AM]
 
902
[2019/11/14:12:00:00 AM]
 
893
[2018/02/28:12:00:00 AM]
 
818
[2018/12/31:12:00:00 AM]
 
787
[2018/11/12:12:00:00 AM]
 
749
Other values (99748)
929824 

Length

Max length24
Median length24
Mean length24
Min length24

Characters and Unicode

Total characters22415352
Distinct characters18
Distinct categories6 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique53149 ?
Unique (%)5.7%

Sample

1st row[2015/02/16:12:00:00 AM]
2nd row[2015/02/16:12:00:00 AM]
3rd row[2015/02/16:12:00:00 AM]
4th row[2015/02/16:12:00:00 AM]
5th row[2015/02/16:12:00:00 AM]
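
The date columns are profiled as Categorical because their values are bracketed strings such as `[2015/02/16:12:00:00 AM]`. As a minimal sketch, such values could be converted to timestamps before profiling. The helper name `parse_report_date` is hypothetical, and the format string is inferred from the sample values shown in this report rather than taken from the source system.

```python
from datetime import datetime

# Assumption: format inferred from sample values like "[2015/02/16:12:00:00 AM]".
REPORT_DATE_FORMAT = "[%Y/%m/%d:%I:%M:%S %p]"

def parse_report_date(text):
    """Parse one bracketed date string; return None for missing values."""
    if text is None or text == "NaN":
        return None
    return datetime.strptime(text, REPORT_DATE_FORMAT)

# 12-hour clock: "12:00:00 AM" is midnight, "04:28:36 PM" is 16:28:36.
print(parse_report_date("[2015/02/16:12:00:00 AM]"))
print(parse_report_date("[2020/09/24:04:28:36 PM]"))
```

Parsed this way, the columns would likely be typed as dates instead of high-cardinality categoricals.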

Common Values

ValueCountFrequency (%)
[2018/06/19:12:00:00 AM]902
 
0.1%
[2019/11/14:12:00:00 AM]893
 
0.1%
[2018/02/28:12:00:00 AM]818
 
0.1%
[2018/12/31:12:00:00 AM]787
 
0.1%
[2018/11/12:12:00:00 AM]749
 
0.1%
[2018/10/17:12:00:00 AM]744
 
0.1%
[2019/11/15:12:00:00 AM]715
 
0.1%
[2016/10/17:12:00:00 AM]705
 
0.1%
[2018/07/17:12:00:00 AM]704
 
0.1%
[2018/04/30:12:00:00 AM]701
 
0.1%
Other values (99743)926255
99.2%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
am516327
27.6%
pm417646
22.4%
2018/06/19:12:00:00902
 
< 0.1%
2019/11/14:12:00:00893
 
< 0.1%
2018/02/28:12:00:00818
 
< 0.1%
2018/12/31:12:00:00787
 
< 0.1%
2018/11/12:12:00:00749
 
< 0.1%
2018/10/17:12:00:00744
 
< 0.1%
2019/11/15:12:00:00715
 
< 0.1%
2016/10/17:12:00:00705
 
< 0.1%
Other values (99729)927660
49.7%

Most occurring characters

ValueCountFrequency (%)
04373506
19.5%
:2801919
12.5%
22572566
11.5%
12571689
11.5%
/1867946
8.3%
[933973
 
4.2%
]933973
 
4.2%
M933973
 
4.2%
933973
 
4.2%
9678044
 
3.0%
Other values (8)3813790
17.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number13075622
58.3%
Other Punctuation4669865
 
20.8%
Uppercase Letter1867946
 
8.3%
Open Punctuation933973
 
4.2%
Close Punctuation933973
 
4.2%
Space Separator933973
 
4.2%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
04373506
33.4%
22572566
19.7%
12571689
19.7%
9678044
 
5.2%
5599821
 
4.6%
3596840
 
4.6%
4554704
 
4.2%
8447844
 
3.4%
6377475
 
2.9%
7303133
 
2.3%
Uppercase Letter
ValueCountFrequency (%)
M933973
50.0%
A516327
27.6%
P417646
22.4%
Other Punctuation
ValueCountFrequency (%)
:2801919
60.0%
/1867946
40.0%
Open Punctuation
ValueCountFrequency (%)
[933973
100.0%
Close Punctuation
ValueCountFrequency (%)
]933973
100.0%
Space Separator
ValueCountFrequency (%)
933973
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common20547406
91.7%
Latin1867946
 
8.3%

Most frequent character per script

Common
ValueCountFrequency (%)
04373506
21.3%
:2801919
13.6%
22572566
12.5%
12571689
12.5%
/1867946
9.1%
[933973
 
4.5%
]933973
 
4.5%
933973
 
4.5%
9678044
 
3.3%
5599821
 
2.9%
Other values (5)2279996
11.1%
Latin
ValueCountFrequency (%)
M933973
50.0%
A516327
27.6%
P417646
22.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII22415352
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
04373506
19.5%
:2801919
12.5%
22572566
11.5%
12571689
11.5%
/1867946
8.3%
[933973
 
4.2%
]933973
 
4.2%
M933973
 
4.2%
933973
 
4.2%
9678044
 
3.0%
Other values (8)3813790
17.0%

ProductDescription
Categorical

HIGH CORRELATION

Distinct36
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size76.4 MiB
FINCA KAROBARI KARZA BASIC
369917 
FINCA KAROBARI KARZA BASIC 12 MONTHS
141831 
FINCA MAWESHI KARZA BASIC
90532 
FINCA MAWESHI KARZA BASIC 12 MONTHS
70989 
FINCA KAROBARI KARZA PLUS
50756 
Other values (31)
209948 

Length

Max length45
Median length44
Mean length28.81792514
Min length18

Characters and Unicode

Total characters26915164
Distinct characters33
Distinct categories5 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowFINCA KAROBARI KARZA BASIC
2nd rowFINCA KAROBARI KARZA BASIC
3rd rowFINCA KAROBARI KARZA BASIC
4th rowFINCA KAROBARI KARZA BASIC
5th rowFINCA KAROBARI KARZA BASIC

Common Values

ValueCountFrequency (%)
FINCA KAROBARI KARZA BASIC369917
39.6%
FINCA KAROBARI KARZA BASIC 12 MONTHS141831
 
15.2%
FINCA MAWESHI KARZA BASIC90532
 
9.7%
FINCA MAWESHI KARZA BASIC 12 MONTHS70989
 
7.6%
FINCA KAROBARI KARZA PLUS50756
 
5.4%
FINCA KASHTKAR KARZA BASIC45007
 
4.8%
FINCA MAWESHI KARZA FATTENING28344
 
3.0%
FINCA MAWESHI KARZA PLUS 27961
 
3.0%
FINCA NISWAN KARZA 12 MONTHS23058
 
2.5%
FINCA KAROBARI KARZA PLUS 12 MONTHS21674
 
2.3%
Other values (26)63904
 
6.8%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
finca933537
21.5%
karza933443
21.5%
basic735567
17.0%
karobari604192
13.9%
months316447
 
7.3%
12273640
 
6.3%
maweshi246297
 
5.7%
plus131844
 
3.0%
kashtkar46984
 
1.1%
niswan33032
 
0.8%
Other values (21)79465
 
1.8%

Most occurring characters

ValueCountFrequency (%)
A5154578
19.2%
3428496
12.7%
I2587057
9.6%
R2189035
8.1%
C1669540
 
6.2%
K1631603
 
6.1%
S1513007
 
5.6%
N1373406
 
5.1%
B1339795
 
5.0%
F966752
 
3.6%
Other values (23)5061895
18.8%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter22863102
84.9%
Space Separator3428496
 
12.7%
Decimal Number623347
 
2.3%
Dash Punctuation161
 
< 0.1%
Other Punctuation58
 
< 0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
A5154578
22.5%
I2587057
11.3%
R2189035
9.6%
C1669540
 
7.3%
K1631603
 
7.1%
S1513007
 
6.6%
N1373406
 
6.0%
B1339795
 
5.9%
F966752
 
4.2%
Z936381
 
4.1%
Other values (13)3501948
15.3%
Decimal Number
ValueCountFrequency (%)
1306361
49.1%
2274143
44.0%
818554
 
3.0%
514167
 
2.3%
69583
 
1.5%
4503
 
0.1%
336
 
< 0.1%
Space Separator
ValueCountFrequency (%)
3428496
100.0%
Dash Punctuation
ValueCountFrequency (%)
-161
100.0%
Other Punctuation
ValueCountFrequency (%)
/58
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin22863102
84.9%
Common4052062
 
15.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
A5154578
22.5%
I2587057
11.3%
R2189035
9.6%
C1669540
 
7.3%
K1631603
 
7.1%
S1513007
 
6.6%
N1373406
 
6.0%
B1339795
 
5.9%
F966752
 
4.2%
Z936381
 
4.1%
Other values (13)3501948
15.3%
Common
ValueCountFrequency (%)
3428496
84.6%
1306361
 
7.6%
2274143
 
6.8%
818554
 
0.5%
514167
 
0.3%
69583
 
0.2%
4503
 
< 0.1%
-161
 
< 0.1%
/58
 
< 0.1%
336
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII26915164
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
A5154578
19.2%
3428496
12.7%
I2587057
9.6%
R2189035
8.1%
C1669540
 
6.2%
K1631603
 
6.1%
S1513007
 
5.6%
N1373406
 
5.1%
B1339795
 
5.0%
F966752
 
3.6%
Other values (23)5061895
18.8%

Interactions

[Figure: pairwise interaction scatter plots for the numeric variables (121 panels, Matplotlib v3.5.3)]

Correlations


Auto

The auto setting is an easily interpretable pairwise column metric that selects a method from the variable types of each pair (vartype-vartype : method): categorical-categorical : Cramér's V; numerical-categorical : Cramér's V (using a discretized numerical column); numerical-numerical : Spearman's ρ. This configuration uses the most suitable method for each pair of columns.
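
The mapping above can be sketched as a small lookup. This is an illustration of the rule as described, not the profiling library's own code; the function name `auto_method` and the type labels `"numerical"` and `"categorical"` are assumptions made for the example.

```python
def auto_method(type_a, type_b):
    """Return the correlation method the auto setting would pick for a pair."""
    kinds = {type_a, type_b}
    if not kinds <= {"numerical", "categorical"}:
        raise ValueError(f"unsupported variable types: {type_a!r}, {type_b!r}")
    if kinds == {"numerical"}:
        return "spearman"
    # Any pair involving a categorical column uses Cramér's V; a numerical
    # partner would first have to be discretized into bins.
    return "cramers_v"

print(auto_method("numerical", "numerical"))      # spearman
print(auto_method("numerical", "categorical"))    # cramers_v
print(auto_method("categorical", "categorical"))  # cramers_v
```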

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
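
The calculation described above can be sketched in plain Python. The function names are hypothetical, tied values receive the average of their rank positions (a common convention, assumed here), and the input data is invented to show a monotonic but nonlinear relationship.

```python
from statistics import mean

def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def covariance_ratio(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

def spearman_rho(x, y):
    """Apply the covariance ratio to the rank variables of x and y."""
    return covariance_ratio(average_ranks(x), average_ranks(y))

x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]        # monotonic, but not linear
print(spearman_rho(x, y))      # close to 1: the ranks agree perfectly
print(covariance_ratio(x, y))  # below 1: the raw relationship is not a line
```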

Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the steepness of the slope does not affect the magnitude of r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
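
As a worked sketch of this formula, the snippet below uses the `principal_amt` and `markup_amt` values as read from the ten "First rows" sample records of this report. In those rows the two amounts always sum to 5350, so the relationship is exactly linear with a negative slope. The function name `pearson_r` is hypothetical.

```python
from statistics import mean

def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

# Values read from the first ten sample rows (installments 1 to 10).
principal_amt = [3469, 3412, 3612, 3705, 3907, 4022, 4189, 4396, 4547, 4756]
markup_amt = [1881, 1938, 1738, 1645, 1443, 1328, 1161, 954, 803, 594]

print(pearson_r(principal_amt, markup_amt))  # approximately -1

# Invariance under location and scale: rescaling one variable leaves r unchanged.
rescaled = [2 * p + 3 for p in principal_amt]
print(pearson_r(rescaled, markup_amt))       # approximately -1 again
```

Ten rows of a single loan say nothing about the full 933973-row dataset; the sample only makes the arithmetic easy to follow.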

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the discordant pairs divided by the total number of pairs.
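
The pair-counting definition above corresponds to the variant usually called τ-a, sketched below. Statistical libraries often default to a tie-adjusted variant (τ-b), so results can differ when the data contains ties. The function name `kendall_tau_a` is hypothetical; the example values are `install_no` and `os_principal` as read from the first five sample rows of this report.

```python
from itertools import combinations

def kendall_tau_a(x, y):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(list(zip(x, y)), 2):
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1
        elif direction < 0:
            discordant += 1
        # Pairs tied in x or y count toward neither total.
    total_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / total_pairs

install_no = [1, 2, 3, 4, 5]
os_principal = [46531, 43119, 39507, 35802, 31895]
# The outstanding principal falls with every installment, so all pairs are discordant.
print(kendall_tau_a(install_no, os_principal))  # -1.0
```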

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
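
A minimal sketch of a bias-corrected Cramér's V follows, using the commonly cited form of Bergsma's correction. It is not the profiling library's own implementation; the function name `cramers_v_corrected` and the contingency tables are invented for illustration.

```python
from math import sqrt

def cramers_v_corrected(table):
    """Bias-corrected Cramér's V for a contingency table given as rows of counts."""
    n = sum(sum(row) for row in table)
    r, k = len(table), len(table[0])
    row_totals = [sum(row) for row in table]
    col_totals = [sum(row[j] for row in table) for j in range(k)]

    # Pearson chi-squared statistic against the independence expectation.
    chi2 = 0.0
    for i in range(r):
        for j in range(k):
            expected = row_totals[i] * col_totals[j] / n
            if expected > 0:
                chi2 += (table[i][j] - expected) ** 2 / expected

    phi2 = chi2 / n
    # Correction: shrink phi^2 and the table dimensions by their expected bias.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denom = min(r_corr - 1, k_corr - 1)
    return sqrt(phi2_corr / denom) if denom > 0 else 0.0

print(cramers_v_corrected([[50, 0], [0, 50]]))    # perfect association, close to 1
print(cramers_v_corrected([[25, 25], [25, 25]]))  # independence, 0.0
```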

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. There is extensive documentation available here.

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion by eye.
The dendrogram allows you to more fully correlate variable completion, revealing trends deeper than the pairwise ones visible in the correlation heatmap.
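
The quantities behind the first two views can be sketched without any plotting. The records below are abbreviated from this report's first and last sample rows to three columns, with `None` standing in for NaN; the function names are hypothetical.

```python
# Abbreviated from the sample rows: paid installments carry a receiving date,
# outstanding ones (os_install_flag "Y") do not.
rows = [
    {"install_no": 1, "os_install_flag": "N",
     "installment receiving date": "[2014/03/08:12:00:00 AM]"},
    {"install_no": 2, "os_install_flag": "N",
     "installment receiving date": "[2014/04/07:12:00:00 AM]"},
    {"install_no": 11, "os_install_flag": "Y",
     "installment receiving date": None},
    {"install_no": 12, "os_install_flag": "Y",
     "installment receiving date": None},
]

def missing_by_column(records):
    """Count missing cells per column (the data behind the nullity bar chart)."""
    return {col: sum(1 for rec in records if rec[col] is None)
            for col in records[0]}

def nullity_matrix(records):
    """One row per record; True where a value is present, False where missing."""
    columns = list(records[0])
    return [[rec[col] is not None for col in columns] for rec in records]

print(missing_by_column(rows))
for line in nullity_matrix(rows):
    print(line)
```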

Sample

First rows

unit_id | business_proposal_no | sub_proposal_no | install_no | install_due_date | principal_amt | markup_amt | insurance_amt | finance_chgs | installment_chgs | tracker_chgs | principal_recv | markup_recv | insurance_recv | finance_chgs_recv | installment_chgs_recv | tracker_chgs_recv | os_principal | os_markup | os_install_flag | installment receiving date | no_of_receipts | session_id | Status | Loan Settelment_Date | ProductDescription
020959273761[2014/03/07:12:00:00 AM]3469.01881.000003469.01881.0000046531.012369.0N[2014/03/08:12:00:00 AM]0653770X[2015/02/16:12:00:00 AM]FINCA KAROBARI KARZA BASIC
120959273762[2014/04/07:12:00:00 AM]3412.01938.000003412.01938.0000043119.010431.0N[2014/04/07:12:00:00 AM]0653770X[2015/02/16:12:00:00 AM]FINCA KAROBARI KARZA BASIC
220959273763[2014/05/07:12:00:00 AM]3612.01738.000003612.01738.0000039507.08693.0N[2014/05/07:12:00:00 AM]0653770X[2015/02/16:12:00:00 AM]FINCA KAROBARI KARZA BASIC
320959273764[2014/06/07:12:00:00 AM]3705.01645.000003705.01645.0000035802.07048.0N[2014/06/10:12:00:00 AM]0653770X[2015/02/16:12:00:00 AM]FINCA KAROBARI KARZA BASIC
420959273765[2014/07/07:12:00:00 AM]3907.01443.000003907.01443.0000031895.05605.0N[2014/07/08:12:00:00 AM]0653770X[2015/02/16:12:00:00 AM]FINCA KAROBARI KARZA BASIC
520959273766[2014/08/07:12:00:00 AM]4022.01328.000004022.01328.0000027873.04277.0N[2014/08/08:12:00:00 AM]0653770X[2015/02/16:12:00:00 AM]FINCA KAROBARI KARZA BASIC
620959273767[2014/09/07:12:00:00 AM]4189.01161.000004189.01161.0000023684.03116.0N[2014/09/11:12:00:00 AM]0653770X[2015/02/16:12:00:00 AM]FINCA KAROBARI KARZA BASIC
720959273768[2014/10/07:12:00:00 AM]4396.0954.000004396.0954.0000019288.02162.0N[2014/10/15:12:00:00 AM]0653770X[2015/02/16:12:00:00 AM]FINCA KAROBARI KARZA BASIC
820959273769[2014/11/07:12:00:00 AM]4547.0803.000004547.0803.0000014741.01359.0N[2014/11/11:12:00:00 AM]0653770X[2015/02/16:12:00:00 AM]FINCA KAROBARI KARZA BASIC
9209592737610[2014/12/07:12:00:00 AM]4756.0594.000004756.0594.000009985.0765.0N[2014/12/11:12:00:00 AM]0653770X[2015/02/16:12:00:00 AM]FINCA KAROBARI KARZA BASIC

Last rows

unit_id | business_proposal_no | sub_proposal_no | install_no | install_due_date | principal_amt | markup_amt | insurance_amt | finance_chgs | installment_chgs | tracker_chgs | principal_recv | markup_recv | insurance_recv | finance_chgs_recv | installment_chgs_recv | tracker_chgs_recv | os_principal | os_markup | os_install_flag | installment receiving date | no_of_receipts | session_id | Status | Loan Settelment_Date | ProductDescription
933963904269795913[2020/10/23:12:00:00 AM]7163.823670.1800000.00.0000079639.0217858.98YNaN04011582E[2020/09/24:04:28:36 PM]FINCA KAROBARI KARZA BASIC 12 MONTHS
933964904269795914[2020/11/23:12:00:00 AM]7354.483479.5200000.00.0000072284.5414379.46YNaN04011582E[2020/09/24:04:28:36 PM]FINCA KAROBARI KARZA BASIC 12 MONTHS
933965904269795915[2020/12/23:12:00:00 AM]7777.683056.3200000.00.0000064506.8611323.14YNaN04011582E[2020/09/24:04:28:36 PM]FINCA KAROBARI KARZA BASIC 12 MONTHS
933966904269795916[2021/01/23:12:00:00 AM]8015.622818.3800000.00.0000056491.248504.76YNaN04011582E[2020/09/24:04:28:36 PM]FINCA KAROBARI KARZA BASIC 12 MONTHS
933967904269795917[2021/02/23:12:00:00 AM]8365.832468.1700000.00.0000048125.416036.59YNaN04011582E[2020/09/24:04:28:36 PM]FINCA KAROBARI KARZA BASIC 12 MONTHS
933968904269795918[2021/03/23:12:00:00 AM]8934.831899.1700000.00.0000039190.584137.42YNaN04011582E[2020/09/24:04:28:36 PM]FINCA KAROBARI KARZA BASIC 12 MONTHS
933969904269795919[2021/04/23:12:00:00 AM]9121.721712.2800000.00.0000030068.862425.14YNaN04011582E[2020/09/24:04:28:36 PM]FINCA KAROBARI KARZA BASIC 12 MONTHS
9339709042697959110[2021/05/24:12:00:00 AM]9520.261313.7400000.00.0000020548.601111.40YNaN04011582E[2020/09/24:04:28:36 PM]FINCA KAROBARI KARZA BASIC 12 MONTHS
9339719042697959111[2021/06/23:12:00:00 AM]9965.17868.8300000.00.0000010583.43242.57YNaN04011582E[2020/09/24:04:28:36 PM]FINCA KAROBARI KARZA BASIC 12 MONTHS
9339729042697959112[2021/07/23:12:00:00 AM]10583.43242.5700000.00.000000.000.00YNaN04011582E[2020/09/24:04:28:36 PM]FINCA KAROBARI KARZA BASIC 12 MONTHS